Results 1 - 20 of 102
1.
Biosystems ; 239: 105199, 2024 May.
Article in English | MEDLINE | ID: mdl-38641198

ABSTRACT

Over the past quarter-century, the field of evolutionary biology has been transformed by the emergence of complete genome sequences and the conceptual framework known as the 'Net of Life.' This paradigm shift challenges traditional notions of evolution as a tree-like process, emphasizing the complex, interconnected network of gene flow that may blur the boundaries between distinct lineages. In this context, gene loss, rather than horizontal gene transfer, is the primary driver of gene content, with vertical inheritance playing a principal role. The 'Net of Life' not only impacts our understanding of genome evolution but also has profound implications for classification systems, the rapid appearance of new traits, and the spread of diseases. Here, we explore the core tenets of the 'Net of Life' and its implications for genome-scale phylogenetic divergence, providing a comprehensive framework for further investigations in evolutionary biology.


Subjects
Molecular Evolution, Gene Flow, Genome, Phylogeny, Genome/genetics, Animals, Humans, Horizontal Gene Transfer, Genetic Models, Biological Evolution
2.
PLoS Comput Biol ; 19(11): e1011498, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37934729

ABSTRACT

Public-domain availability for bioinformatics software resources is a key requirement that ensures long-term permanence and methodological reproducibility for research and development across the life sciences. These issues are particularly critical for widely used, efficient, and well-proven methods, especially those developed in research settings that often face funding discontinuities. We re-launch a range of established software components for computational genomics, as legacy version 1.0.1, suitable for sequence matching, masking, searching, clustering and visualization for protein family discovery, annotation and functional characterization on a genome scale. These applications are made available online as open source and include MagicMatch, GeneCAST, support scripts for CoGenT-like sequence collections, GeneRAGE and DifFuse, supported by centrally administered bioinformatics infrastructure funding. The toolkit may also be conceived as a flexible genome comparison software pipeline that supports research in this domain. We illustrate basic use by examples and pictorial representations of the registered tools, which are further described with appropriate documentation files in the corresponding GitHub release.


Subjects
Genomics, Software, Reproducibility of Results, Genomics/methods, Computational Biology/methods, Genome
3.
Nature ; 622(7983): 594-602, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821698

ABSTRACT

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
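
The idea of graph-based clustering mentioned in this abstract can be illustrated with a toy sketch. It is not the authors' pipeline: it assumes that pairwise similarity hits are already available as an edge list and simply groups sequences into connected components with a union-find structure. The function name `cluster_by_similarity`, the `min_size` parameter and the sequence identifiers are hypothetical.

```python
from collections import defaultdict


def cluster_by_similarity(edges, min_size=1):
    """Group sequence IDs into connected components of a similarity graph."""
    parent = {}

    def find(x):
        # Locate the representative of x, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every similarity hit merges the components of its two sequences.
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Keep clusters with at least `min_size` members, largest first.
    kept = [sorted(g) for g in groups.values() if len(g) >= min_size]
    return sorted(kept, key=lambda g: (-len(g), g))


# Hypothetical similarity hits between five sequences.
hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
clusters = cluster_by_similarity(hits, min_size=2)
print(clusters)
```

Connected components are the simplest possible choice here; clustering at the scale described in the abstract would need a far more scalable and selective method.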


Subjects
Metagenome, Metagenomics, Microbiology, Proteins, Cluster Analysis, Metagenome/genetics, Metagenomics/methods, Proteins/chemistry, Proteins/classification, Proteins/genetics, Protein Databases, Protein Conformation
4.
J Mol Evol ; 91(4): 471-481, 2023 08.
Article in English | MEDLINE | ID: mdl-37039856

ABSTRACT

Selenium-binding proteins represent a ubiquitous protein family, and SBP1 was recently described as a new stress response regulator in plants. SBP1 has been characterized as a methanethiol oxidase; however, its exact role remains unclear. Moreover, in mammals, it is involved in the regulation of anti-carcinogenic growth and progression as well as reduction/oxidation modulation and detoxification. In this work, we delineate the functional potential of certain motifs of SBP in the context of evolutionary relationships. The phylogenetic profiling approach revealed the absence of SBP in the fungi phylum as well as in most non-eukaryotic organisms. The phylogenetic tree also indicates the differentiation and evolution of characteristic SBP motifs. The main evolutionary events concern the CSSC motif, for which Acidobacteria, Fungi and Archaea carry modifications. Moreover, the CC motif is harbored by some bacteria and remains conserved in Plants, while it is modified to CxxC in Animals. Thus, the characteristic sequence motifs of SBPs mainly appeared in Archaea and Bacteria and were retained in Animals and Plants. Our results demonstrate the emergence of SBP from bacteria, most likely as a methanethiol oxidase.


Subjects
Proteins, Selenium-Binding Proteins, Animals, Selenium-Binding Proteins/genetics, Selenium-Binding Proteins/metabolism, Phylogeny, Bacteria/genetics, Bacteria/metabolism, Archaea/genetics, Archaea/metabolism, Plants, Oxidoreductases/genetics, Mammals/metabolism
5.
F1000Res ; 12: 198, 2023.
Article in English | MEDLINE | ID: mdl-37082000

ABSTRACT

Background: The evolutionary rate of disordered proteins varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of intrinsically disordered regions (IDRs) across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the Human Genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease-association in the human genome reveals a significant bias towards low complexity regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated, low complexity proteins across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of low complexity, disease-association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
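
Phylogenetic profiling, as used in this abstract, reduces each protein to a presence/absence vector across a fixed list of proteomes. The sketch below shows that representation on made-up data; it assumes homolog detection has already been done, and the protein names, proteome labels and the `hits` mapping are all hypothetical rather than taken from the study.

```python
def phylogenetic_profile(protein, proteomes, hits):
    """Return a 0/1 presence vector of `protein` across `proteomes`."""
    found_in = hits.get(protein, set())
    return [1 if p in found_in else 0 for p in proteomes]


def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical target proteomes and homolog detections.
proteomes = ["human", "mouse", "yeast", "E. coli"]
hits = {
    "protA": {"human", "mouse"},
    "protB": {"human", "mouse", "yeast"},
}

profile_a = phylogenetic_profile("protA", proteomes, hits)
profile_b = phylogenetic_profile("protB", proteomes, hits)
print(profile_a, profile_b, hamming(profile_a, profile_b))
```

Proteins with similar profiles share a taxonomic distribution pattern; the Hamming distance is only one of several possible ways to compare them.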


Subjects
Genomics, Proteome, Humans, Proteome/genetics, Phylogeny, Human Genome, Bias
6.
NAR Genom Bioinform ; 5(1): lqad025, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36968432

ABSTRACT

The nuclear pore complex exhibits different manifestations across eukaryotes, with certain components being restricted to specific clades. Several studies have been conducted to delineate the nuclear pore complex composition in various model organisms. Due to its pivotal role in cell viability, traditional lab experiments, such as gene knockdowns, can prove inconclusive and need to be complemented by a high-quality computational process. Here, using an extensive data collection, we create a robust library of nucleoporin protein sequences and their respective family-specific position-specific scoring matrices. By extensively validating each profile in different settings, we propose that the created profiles can be used to detect nucleoporins in proteomes with high sensitivity and specificity compared to existing methods. This library of profiles and the underlying sequence data can be used for the detection of nucleoporins in target proteomes.
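
As background for the position-specific scoring matrices mentioned in this abstract, the following sketch builds a toy log-odds matrix from a few aligned, gap-free motif instances and scans a sequence with it. It is not the authors' library or procedure: the uniform background, the pseudocount, the helper names and the FG-like example motif are all simplifying assumptions for illustration.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Log-odds scores per column from equal-length, gap-free sequences."""
    background = 1.0 / len(ALPHABET)  # assumed uniform background
    pssm = []
    for column in zip(*aligned):
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for aa in ALPHABET:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def best_window(pssm, sequence):
    """Return (score, start) of the highest-scoring window, or None."""
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[aa] for col, aa in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical aligned motif instances and a query sequence.
motif_examples = ["FGFG", "FGFG", "FGLG"]
pssm = build_pssm(motif_examples)
score, start = best_window(pssm, "MKTAFGFGSSA")
print(start, round(score, 2))
```

Profiles used in practice are derived from much larger alignments with sequence weighting and realistic background frequencies, so the scores here are only indicative of the scoring principle.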

7.
Nat Commun ; 13(1): 915, 2022 02 17.
Article in English | MEDLINE | ID: mdl-35177626

ABSTRACT

Quantitative or qualitative differences in immunity may drive clinical severity in COVID-19. Although longitudinal studies to record the course of immunological changes are ample, they do not necessarily predict clinical progression at the time of hospital admission. Here we show, by a machine learning approach using serum pro-inflammatory, anti-inflammatory and anti-viral cytokine and anti-SARS-CoV-2 antibody measurements as input data, that COVID-19 patients cluster into three distinct immune phenotype groups. These immune-types, determined by unsupervised hierarchical clustering that is agnostic to severity, predict clinical course. The identified immune-types do not associate with disease duration at hospital admittance, but rather reflect variations in the nature and kinetics of individual patients' immune responses. Thus, our work provides an immune-type based scheme to stratify COVID-19 patients at hospital admittance into high and low risk clinical categories with distinct cytokine and antibody profiles that may guide personalized therapy.


Subjects
Viral Antibodies/blood, COVID-19/pathology, Cytokines/blood, SARS-CoV-2/immunology, Severity of Illness Index, Aged, Coronavirus Nucleocapsid Proteins/immunology, Disease Progression, Female, Hospitalization, Humans, Immunoglobulin A/blood, Immunoglobulin G/blood, Immunoglobulin M/blood, Immunophenotyping/methods, Machine Learning, Male, Middle Aged, Phosphoproteins/immunology
8.
Environ Res ; 207: 112183, 2022 05 01.
Article in English | MEDLINE | ID: mdl-34637759

ABSTRACT

In urban ecosystems, microbes play a key role in maintaining major ecological functions that directly support human health and city life. However, knowledge about the species composition and functions involved in urban environments is still limited, largely due to the lack of reference genomes, which leaves more than half of the reads in metagenomic studies unclassified. Here we uncovered 732 novel bacterial species from 4728 samples collected by the MetaSUB Consortium from various common surfaces, with matching materials, in the mass transit systems of 60 cities. The number of novel species is significantly and positively correlated with the city population, and more novel species can be identified in the skin-associated samples. The in-depth analysis of the new gene catalog showed that the functional terms have a significant geographical distinguishability. Moreover, we revealed that more biosynthetic gene clusters (BGCs) can be found in novel species. The co-occurrence relationship between BGCs and genera and the geographical specificity of BGCs can also provide more information on the synthesis pathways of natural products. Our results expand the known urban microbiome diversity and suggest additional mechanisms for taxonomic and functional characterization of the urban microbiome. Considering the great impact of urban microbiomes on human life, our study can also facilitate the analysis of microbial interactions between humans and the urban environment.


Subjects
Metagenome, Microbiota, Bacteria/genetics, Humans, Metagenomics, Microbial Interactions, Microbiota/genetics
9.
Nucleic Acids Res ; 50(D1): D480-D487, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850135

ABSTRACT

The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.


Subjects
Protein Databases, Intrinsically Disordered Proteins/metabolism, Molecular Sequence Annotation, Software, Amino Acid Sequence, DNA/genetics, DNA/metabolism, Datasets as Topic, Gene Ontology, Humans, Internet, Intrinsically Disordered Proteins/chemistry, Intrinsically Disordered Proteins/genetics, Protein Binding, RNA/genetics, RNA/metabolism
10.
Viruses ; 13(4), 2021 03 29.
Article in English | MEDLINE | ID: mdl-33805449

ABSTRACT

The Covid-19 pandemic has required nonpharmaceutical interventions, primarily physical distancing, personal hygiene and face mask use, to limit community transmission, irrespective of seasons. In fact, the seasonality attributes of this pandemic remain one of its biggest unknowns. Early studies based on past experience from respiratory diseases focused on temperature or humidity, with disappointing results. Our hypothesis that ultraviolet (UV) radiation levels might be a factor and a more appropriate parameter has emerged as an alternative to assess seasonality and exploit it for public health policies. Using geographical, socioeconomic and epidemiological criteria, we selected twelve North-equatorial-South countries with similar characteristics. We then obtained UV levels, mobility and Covid-19 daily incidence rates for nearly all of 2020. Using machine learning, we demonstrated that UV radiation is strongly associated with incidence rates, more so than mobility, indicating that UV is a key seasonality indicator for Covid-19, irrespective of the initial conditions of the epidemic. Our findings can inform the implementation of public health emergency measures, partly based on seasons in the Northern and Southern Hemispheres, as the pandemic unfolds into 2021.


Subjects
COVID-19/epidemiology, COVID-19/virology, SARS-CoV-2/radiation effects, Humans, Incidence, Machine Learning, Pandemics, SARS-CoV-2/physiology, Seasons, Temperature, Ultraviolet Rays, Weather
11.
mBio ; 12(1), 2021 01 19.
Article in English | MEDLINE | ID: mdl-33468697

ABSTRACT

Orf8, one of the most puzzling genes in the SARS lineage of coronaviruses, marks a unique and striking difference in genome organization between SARS-CoV-2 and SARS-CoV-1. Here, using sequence comparisons, we unequivocally reveal the distant sequence similarities between SARS-CoV-2 Orf8 and its SARS-CoV-1 counterparts and the X4-like genes of coronaviruses, including its highly divergent "paralog" gene Orf7a, whose product is a potential immune antagonist of known structure. Supervised sequence space walks unravel identity levels that drop below 10% and yet exhibit subtle conservation patterns in this novel superfamily, characterized by an immunoglobulin-like beta sandwich topology. We document the high accuracy of the sequence space walk process in detail and characterize the subgroups of the superfamily in sequence space by systematic annotation of gene and taxon groups. While SARS-CoV-1 Orf7a and Orf8 genes are most similar to bat virus sequences, their SARS-CoV-2 counterparts are closer to pangolin virus homologs, reflecting the fine structure of conservation patterns within the SARS-CoV-2 genomes. The divergence between Orf7a and Orf8 is exceptionally idiosyncratic, since Orf7a is more constrained, whereas Orf8 is subject to rampant change, a peculiar feature that may be related to hitherto-unknown viral infection strategies. Despite their common origin, the Orf7a and Orf8 protein families exhibit different modes of evolutionary trajectories within the coronavirus lineage, which might be partly attributable to their complex interactions with the mammalian host cell, reflected by a multitude of functional associations of Orf8 in SARS-CoV-2 compared to a very small number of interactions discovered for Orf7a. IMPORTANCE: Orf8 is one of the most puzzling genes in the SARS lineage of coronaviruses, including SARS-CoV-2. Using sophisticated sequence comparisons, we confirm its origins from Orf7a, another gene in the lineage that appears to be more conserved than Orf8. Orf7a is a potential immune antagonist of known structure, while a deletion of Orf8 was shown to decrease the severity of the infection in a cohort study. The subtle sequence similarities imply that Orf8 has the same immunoglobulin-like fold as Orf7a, confirmed by structure determination. We characterize the subgroups of this superfamily and demonstrate the highly idiosyncratic divergence patterns during the evolution of the virus.


Subjects
COVID-19/immunology, Immune Evasion, SARS-CoV-2/genetics, SARS-CoV-2/immunology, Viral Proteins/immunology, Animals, COVID-19/virology, Genetic Databases, Molecular Evolution, Viral Genome, Humans, Phylogeny, Sequence Alignment, Viral Proteins/genetics
12.
Big Data ; 9(1): 63-71, 2021 02.
Article in English | MEDLINE | ID: mdl-32991205

ABSTRACT

As high-throughput approaches in biological and biomedical research are transforming the life sciences into information-driven disciplines, modern analytics platforms for big data have started to address the needs for efficient and systematic data analysis and interpretation. We observe that radiobiology is following this general trend, with -omics information providing unparalleled depth into the biomolecular mechanisms of radiation response, defined as systems radiobiology. We outline the design of computational frameworks and discuss the analysis of big data in low-dose ionizing radiation (LDIR) responses of the mammalian brain. Following successful examples and best practices of approaches for the analysis of big data in life sciences and health care, we present the needs and requirements for radiation research. Our goal is to raise awareness in the radiobiology community about the new technological possibilities that can capture complex information and execute data analytics on a large scale. The production of large data sets from genome-wide experiments (quantity) and the complexity of radiation research with multidimensional experimental designs (quality) will necessitate the adoption of the latest information technologies. The main objective is to translate research results into applied clinical and epidemiological practice and to understand the responses of biological tissues to LDIR in order to define new radiation protection policies. We envisage a future where multidisciplinary teams include data scientists, artificial intelligence experts, DevOps engineers, and of course radiation experts to fulfill the augmented needs of the radiobiology community, accelerate research, and devise new strategies.


Subjects
Artificial Intelligence, Big Data, Animals, Radiobiology, Research Design
13.
Comput Struct Biotechnol J ; 18: 4093-4102, 2020.
Article in English | MEDLINE | ID: mdl-33363705

ABSTRACT

The genome of SARS-CoV-2, the coronavirus responsible for the Covid-19 pandemic, encodes a number of accessory genes. The longest accessory gene, Orf3a, plays important roles in the virus lifecycle indicated by experimental findings, known polymorphisms, its evolutionary trajectory and a distinct three-dimensional fold. Here we show that supervised, sensitive database searches with Orf3a detect weak, yet significant and highly specific similarities to the M proteins of coronaviruses. The similarity profiles can be used to derive low-resolution three-dimensional models for M proteins based on Orf3a as a structural template. The models also explain the emergence of Orf3a from M proteins and suggest a recent origin across the coronavirus lineage, enunciated by its restricted phylogenetic distribution. This study provides evidence for the common origin of M and Orf3a families and proposes for the first time a working model for the structure of the universally distributed M proteins in coronaviruses, consistent with the properties of both protein families.

14.
Microb Genom ; 6(11), 2020 11.
Article in English | MEDLINE | ID: mdl-32924924

ABSTRACT

As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes. Herein, we describe PathTrace, an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways, pivotal representations of key functional modules of cellular function. The algorithm is implemented as a five-step process through which pathways are represented as fuzzy vectors, where each enzyme is associated with a taxonomic conservation value derived from the phylogenetic profile of its protein sequence. The method is evaluated with a selected benchmark set of pathways against collections of genome sequences from key data resources. By deploying a pangenome-driven approach for pathway sets, we demonstrate that the inferred patterns are largely insensitive to noise, as opposed to gene-content reconstruction methods. In addition, the resulting reconstructions are closely correlated with the evolutionary distance of the taxa under study, suggesting that a diligent selection of target pangenomes is essential for maintaining cohesiveness of the method and consistency of the inference, serving as an internal control for an arbitrary selection of queries. The PathTrace method is a first step towards the large-scale analysis of metabolic pathway evolution and our deeper understanding of functional relationships reflected in emerging pangenome collections.
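
The fuzzy-vector representation described in this abstract can be pictured with a small sketch. It does not reproduce PathTrace or its five steps: it assumes, purely for illustration, that the conservation value of an enzyme is the fraction of genomes in a taxon where the enzyme is detected. The function names, enzyme labels and profiles below are hypothetical.

```python
def conservation(profile):
    """Fraction of genomes in which the enzyme is detected (0.0 to 1.0)."""
    return sum(profile) / len(profile) if profile else 0.0


def pathway_fuzzy_vector(pathway, profiles):
    """Map each enzyme of a pathway to its conservation value in one taxon."""
    return {enzyme: conservation(profiles.get(enzyme, [])) for enzyme in pathway}


# Hypothetical presence/absence profiles over four genomes of one taxon.
profiles = {
    "enzyme_1": [1, 1, 1, 1],
    "enzyme_2": [1, 0, 1, 0],
    "enzyme_3": [0, 0, 0, 0],
}

vector = pathway_fuzzy_vector(["enzyme_1", "enzyme_2", "enzyme_3"], profiles)
print(vector)
```

Values between 0 and 1 express partial conservation of a pathway step, which is what distinguishes a fuzzy vector from a plain presence/absence profile.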


Subjects
Algorithms, Bacteria/genetics, Bacteria/metabolism, Molecular Evolution, Genome/genetics, Metabolic Networks and Pathways/genetics, Amino Acid Sequence, Base Sequence, Phylogeny, Software
15.
EMBO Rep ; 21(4): e50388, 2020 04 03.
Article in English | MEDLINE | ID: mdl-32216085

ABSTRACT

University accountants and administrators should support scientists going to meetings, not further burden them with bureaucratic hurdles, expense claims or unnecessary auditing.


Subjects
Travel, Humans
16.
Bioinformatics ; 36(9): 2963-2965, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32129821
17.
NAR Genom Bioinform ; 2(4): lqaa088, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33575632

ABSTRACT

Ribosomal genes produce the constituents of the ribosome, one of the most conserved subcellular structures of all cells, from bacteria to eukaryotes, including animals. There are notions that some protein-coding ribosomal genes vary in their roles across species, particularly vertebrates, through the involvement of some in a number of genetic diseases. Based on extensive sequence comparisons and systematic curation, we establish a reference set for ribosomal proteins (RPs) in eleven vertebrate species and quantify their sequence conservation levels. Moreover, we correlate their coordinated gene expression patterns within up to 33 tissues and assess the exceptional role of paralogs in tissue specificity. Importantly, our analysis, supported by the development and use of machine learning models, strongly suggests that the variation in the observed tissue-specific gene expression of RPs is species-related rather than due to tissue-based evolutionary processes. The data obtained suggest that RPs exhibit a complex relationship between their structure and function that broadly maintains a consistent expression landscape across tissues, while most of the variation arises from species idiosyncrasies. The latter may be due to evolutionary change and adaptation, rather than functional constraints at the tissue level throughout the vertebrate lineage.

18.
Brief Bioinform ; 21(2): 458-472, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30698641

ABSTRACT

There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs. SHORT ABSTRACT: There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
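
One common statistical measure of low complexity is the Shannon entropy of the residue composition within a sliding window. The sketch below illustrates that single measure only; it is not a method proposed in the review, and the window length, entropy threshold, function names and example sequence are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of `segment`."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, window=12, threshold=2.2):
    """Return start positions of windows whose entropy is below `threshold`."""
    return [
        i
        for i in range(len(sequence) - window + 1)
        if shannon_entropy(sequence[i:i + window]) < threshold
    ]


# Hypothetical sequence with a poly-Q stretch between two diverse flanks.
seq = "MKTLVDAGHE" + "QQQQQQQQQQQQ" + "RSTVWYACDF"
print(low_complexity_windows(seq))
```

As the review argues, a single statistic of this kind cannot capture all structural aspects of LCRs, so such a scan is best combined with other predictors.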


Subjects
Proteins/chemistry, Algorithms, Amino Acid Sequence, Protein Databases, Molecular Evolution, Protein Conformation, Protein Domains
19.
Nucleic Acids Res ; 48(D1): D269-D276, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31713636

ABSTRACT

The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the 'dark' proteome.


Subjects
Protein Databases, Intrinsically Disordered Proteins/chemistry, Biological Ontologies, Data Curation, Molecular Sequence Annotation
20.
F1000Res ; 8, 2019.
Article in English | MEDLINE | ID: mdl-31824649

ABSTRACT

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs. The roadmap is the result of a workshop titled "An intrinsically disordered protein user community proposal for ELIXIR" held at the University of Padua. The workshop, and further consultation with the members of the wider IDP community, identified the key priority areas for the roadmap including the development of standards for data annotation, storage and dissemination; integration of IDP data into the ELIXIR Core Data Resources; and the creation of benchmarking criteria for IDP-related software. Here, we discuss these areas of priority, how they can be implemented in cooperation with the ELIXIR platforms, and their connections to existing ELIXIR Communities and international consortia. The article provides a preliminary blueprint for an IDP Community in ELIXIR and is an appeal to identify and involve new stakeholders.


Subjects
Intrinsically Disordered Proteins/metabolism